-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bidirectional packet capture #6882
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Aryan Bakliwal <[email protected]>
Signed-off-by: Aryan Bakliwal <[email protected]>
@hangyan please let me know what you think about this |
…antrea-io#6877) Bumps the golang-org-x group with 1 update: [golang.org/x/net](https://github.com/golang/net). Updates `golang.org/x/net` from 0.32.0 to 0.33.0 - [Commits](golang/net@v0.32.0...v0.33.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production update-type: version-update:semver-minor dependency-group: golang-org-x ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…oup (antrea-io#6880) Bumps the ginkgo group with 1 update: [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo). Updates `github.com/onsi/ginkgo/v2` from 2.22.0 to 2.22.1 - [Release notes](https://github.com/onsi/ginkgo/releases) - [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md) - [Commits](onsi/ginkgo@v2.22.0...v2.22.1) --- updated-dependencies: - dependency-name: github.com/onsi/ginkgo/v2 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: ginkgo ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Aryan Bakliwal <[email protected]>
Hey @hangyan @antoninbas, I made the changes in the bpf code according to the one generated by tcpdump. When I try to test the bidirectional packet capture, it fails with this log
Could you please take a look and help me identify what might be going wrong? Also, golangci-lint is giving me this error even though I have changed the
|
this error was reported by golangci on windows, we have a |
Signed-off-by: Aryan Bakliwal <[email protected]>
I’ve added the Please let me know if any adjustments are needed or if there's anything else I should update or add. Captured packets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add an e2e test with the Both
direction?
Signed-off-by: Aryan Bakliwal <[email protected]>
7b17816
to
7eb34de
Compare
@antoninbas @hangyan I have refactored the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your ongoing work on this
// (004) ld [26] # Load 4B at 26 (source address) | ||
// (005) jeq #0xaf40102 jt 6 jf 15 # If bytes match(10.244.0.2), goto #6, else #15 | ||
// (006) ld [30] # Load 4B at 30 (dest address) | ||
// (007) jeq #0xaf40103 jt 8 jf 26 # If bytes match(10.244.0.3), goto #8, else #26 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would the code be easier to generate if we jumped to (015) in case this instruction (007) was not a match?
the fact that we know we can skip the second or
operand in this case is an optimization. Not saying it's not a good optimization, but if using a "fixed" offset makes code generation much easier, maybe we should do it for now.
I'm thinking about the discussion we just had during the community meeting. That being said, I can see why generating the same code as tcpdump is valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AryanBakliwal Please take a look at this ^
Taking a concrete example:
tcpdump -d "ip proto 6 and ((src host 10.244.1.2 and dst host 10.244.1.3) or (src host 10.244.1.3 and dst host 10.244.1.2))"
This is the code generated by tcpdump:
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 12
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 12
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 8
(006) ld [30]
(007) jeq #0xaf40103 jt 11 jf 12
(008) jeq #0xaf40103 jt 9 jf 12
(009) ld [30]
(010) jeq #0xaf40102 jt 11 jf 12
(011) ret #262144
(012) ret #0
If we generate this code instead, I believe it will be simpler:
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 12
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 12
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 8
(006) ld [30]
(007) jeq #0xaf40103 jt 11 jf 8 <- less efficient than tcpdump's version but easier to implement?
(008) jeq #0xaf40103 jt 9 jf 12
(009) ld [30]
(010) jeq #0xaf40102 jt 11 jf 12
(011) ret #262144
(012) ret #0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas Actually, the real problem is the calculation of skipTrue
(for which I created the function calculateSkipTrue
) when we want to accept the packet. But calculateSkipTrue
depends on many parameters [srcPort
, dstPort
, direction
, which port is loaded in the register (srcPort
or dstPort
) and if we are calling this function from the first half (instructions that check request) or the second half (instructions that check response)] to calculate the number of instructions to skip to jump to the returnKeep
instruction. This makes the code generation very complex if generating instruction set exactly like tcpdump.
Taking each case:
- ICMP (both
srcPort
anddstPort
are 0)
tcpdump -d "ip proto 6 and ((src host 10.244.1.2 and dst host 10.244.1.3) or (src host 10.244.1.3 and dst host 10.244.1.2))"
Code generated by tcpdump:
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 12
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 12
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 8
(006) ld [30]
(007) jeq #0xaf40103 jt 11 jf 12 <- need to jump to 11 from 7, difference to calculate = 3
(008) jeq #0xaf40103 jt 9 jf 12
(009) ld [30]
(010) jeq #0xaf40102 jt 11 jf 12 <- skipTrue = 0
(011) ret #262144
(012) ret #0
- TCP (
srcPort
= 0 anddstPort
> 0)
tcpdump -d "ip proto 6 and ((src host 10.244.1.2 and dst host 10.244.1.3 and dst port 80) or (src host 10.244.1.3 and dst host 10.244.1.2 and src port 80))"
Code generated by tcpdump:
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 22
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 22
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 13
(006) ld [30]
(007) jeq #0xaf40103 jt 8 jf 22 <- jump to next instruction since port(s) not zero
(008) ldh [20]
(009) jset #0x1fff jt 22 jf 10
(010) ldxb 4*([14]&0xf)
(011) ldh [x + 16]
(012) jeq #0x50 jt 21 jf 22 <- need to jump to 21 from 12, difference to calculate = 8
(013) jeq #0xaf40103 jt 14 jf 22
(014) ld [30]
(015) jeq #0xaf40102 jt 16 jf 22
(016) ldh [20]
(017) jset #0x1fff jt 22 jf 18
(018) ldxb 4*([14]&0xf)
(019) ldh [x + 14]
(020) jeq #0x50 jt 21 jf 22 <- skipTrue = 0
(021) ret #262144
(022) ret #0
Note that this is similar to when (srcPort
> 0 and dstPort
= 0) and (srcPort
> 0 and dstPort
> 0).
Since the problem is to jump to the returnKeep
instruction, why not add it again to mitigate this complex calculation and jump to it easily. For example:
- ICMP:
Code that can be generated more simply
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 12
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 12
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 8
(006) ld [30]
(007) jeq #0xaf40103 jt 8 jf 13 <- just skip to next instruction if true
(008) ret #262144
(009) jeq #0xaf40103 jt 10 jf 13
(010) ld [30]
(011) jeq #0xaf40102 jt 12 jf 13
(012) ret #262144
(013) ret #0
- TCP (
srcPort
= 0 anddstPort
> 0)
Code that can be generated more simply
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 23
(002) ldb [23]
(003) jeq #0x6 jt 4 jf 23
(004) ld [26]
(005) jeq #0xaf40102 jt 6 jf 13
(006) ld [30]
(007) jeq #0xaf40103 jt 8 jf 23
(008) ldh [20]
(009) jset #0x1fff jt 23 jf 10
(010) ldxb 4*([14]&0xf)
(011) ldh [x + 16]
(012) jeq #0x50 jt 13 jf 23 <- just skip to next instruction if true
(013) ret #262144
(014) jeq #0xaf40103 jt 15 jf 23
(015) ld [30]
(016) jeq #0xaf40102 jt 17 jf 23
(017) ldh [20]
(018) jset #0x1fff jt 23 jf 19
(019) ldxb 4*([14]&0xf)
(020) ldh [x + 14]
(021) jeq #0x50 jt 22 jf 23
(022) ret #262144
(023) ret #0
Let me know what you think about this approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding an extra return instruction sounds fine to me.
I think what we are going for is some code that looks like this in the end:
if direction == SourceToDest {
generateFilters(offset, srcIP, dstIP, srcPort, dstPort)
} else if direction == DestToSource {
generateFilters(offset, dstIP, srcIP, dstPort, srcPort)
} else {
generateFilters(offset, srcIP, dstIP, srcPort, dstPort)
// ...
generateFilters(offset, dstIP, srcIP, dstPort, srcPort)
}
So if we can get there and have some generate code that looks reasonable and works, that would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas I have made the requested changes and refactored the code as you suggested and as per our discussion. PTAL and let me know if there is anything else to update or add.
Signed-off-by: Aryan Bakliwal <[email protected]>
// (012) jeq #0x7b jt 13 jf 27 # TCP Source Port: If 123, goto #13, else #26 | ||
// (013) ldh [x + 16] # Load 2B at x+16 (TCP dst port) | ||
// (014) jeq #0x7c jt 15 jf 27 # TCP dst port: If 123, goto #15, else #26 | ||
// (015) ret #262144 # MATCH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a stupid question, but if this is the only change we have, it doesn't seem worth it? Why can't we calculate the jump size using size - curOffset - 1
given that:
- we always compute the full size up front
- we always know the current offset ("where we are")
- the instruction before last is always a
MATCH
return instruction
What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas You're right, the jump size can indeed be calculated easily with the available information (full size and current offset).
The main challenge, however, isn't the calculation itself but rather when to perform it, meaning in which instructions do we need skipTrue
set to size - curOffset - 1
.
And if a helper function is used for calculation, then it depends on which instruction is calling the helper function, what is the direction and what is the value of ports. Adding this extra instruction helps simplify the instruction set generation and avoids unnecessary complexity.
The challenges I faced:
For request packets in Both
direction, there are 3 cases (assuming the 3 instructions - dstIP. srcPort and srcPort comparison instructions are calling a helper function to get jump size/skipTrue
value):
- When both ports are 0, the destination IP comparison instruction performs the calculation to jump to
MATCH
. - When only one port is 0, the destination IP comparison sets skipTrue to 0, and the comparison for the defined port performs the calculation to jump to
MATCH
. - When neither port is 0, all comparisons set skipTrue to 0, except for the destination port comparison, which performs the calculation to jump to
MATCH
.
In SourceToDestination
, DestinationToSource
and response packets in Both
, no calculations are needed, and these 3 instructions (dstIP, srcPort and dstPort comparison) jump to MATCH
step with skipTrue
set to 0.
The helper function that calculates the jump size or returns the value of skipTrue
requires additional context which is a bunch of parameters such as the calling instruction, direction, ports, size and current offset. Adding this logic in the code makes it difficult to extend in the future if we filter traffic on more parameters and also increases complexity.
Hope that this explanation makes sense and adding an extra MATCH
instruction seems reasonable. Any suggestions on how we could simplify or approach this differently? I'm open to any ideas you might have for improving the implementation of calculation of skipTrue
value.
// 'ip proto 6 and ((src host 10.244.1.2 and dst host 10.244.1.3 and src port 123 and dst port 124) or (src host 10.244.1.3 and dst host 10.244.1.2 and src port 124 and dst port 123))' | ||
// And using `tcpdump -i <device> '<filter>' -d` will generate the following BPF instructions: | ||
// (000) ldh [12] # Load 2B at 12 (Ethertype) | ||
// (001) jeq #0x800 jt 2 jf 26 # Ethertype: If IPv4, goto #2, else #26 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in comment section, else #26
should be else $27
?
return count | ||
} | ||
|
||
func compareIPPort(srcAddrVal, dstAddrVal uint32, size, curLen uint8, srcPort, dstPort uint16, useSkipFalse bool) []bpf.Instruction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function name is not very straightforward, can you use a new one? we may also need some comments.
Also i think the param useSkipFalse
maybe misleading, we always use skip false
, the difference is skipFalse to ret #0
or skipFalse to some intermediate instructions.
fixes: #6862
Added
bidirectional
field in packet capture CR spec.For testing, I created two pods and pinged one from the other.
Screenshot of the
![image](https://private-user-images.githubusercontent.com/106430579/398050679-34262988-987f-43ba-870c-ca82cc02089d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg3ODk5NTQsIm5iZiI6MTczODc4OTY1NCwicGF0aCI6Ii8xMDY0MzA1NzkvMzk4MDUwNjc5LTM0MjYyOTg4LTk4N2YtNDNiYS04NzBjLWNhODJjYzAyMDg5ZC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjA1JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIwNVQyMTA3MzRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0yNmZjYzMyNmUwY2YzMTdkZTcyMmFmYTQxZTBhYzgwY2U3NTUzMTc2ZGQxZTBkOWMzMDEyYjI0MWFjODNjZGQ3JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.e9pBjlkHGkNuOmmrrIE3O1iIvaqtj9tXGl4u9fnIUrg)
.pcapng
output file.